A system for exact and approximate genetic linkage analysis of SNP data in large pedigrees

Authors

  • Mark Silberstein
  • Omer Weissbrod
  • Lars Otten
  • Anna Tzemach
  • Andrei Anisenia
  • Oren Shtark
  • Dvir Tuberg
  • Eddie Galfrin
  • Irena Gannon
  • Adel Shalata
  • Zvi U. Borochowitz
  • Rina Dechter
  • Elizabeth Thompson
  • Dan Geiger
Abstract

MOTIVATION The use of dense single nucleotide polymorphism (SNP) data in genetic linkage analysis of large pedigrees is impeded by significant technical, methodological and computational challenges. Here we describe Superlink-Online SNP, a powerful new online system that streamlines the linkage analysis of SNP data. It features a fully integrated, flexible processing workflow comprising both well-known and novel data analysis tools, including SNP clustering, erroneous data filtering, exact and approximate LOD calculations and maximum-likelihood haplotyping. The system draws its power from thousands of CPUs, performing data analysis tasks orders of magnitude faster than a single computer. By providing an intuitive interface to sophisticated state-of-the-art analysis tools coupled with high computing capacity, Superlink-Online SNP helps geneticists unleash the potential of SNP data for detecting disease genes.

RESULTS Computations performed by Superlink-Online SNP are automatically parallelized using novel paradigms and executed on an unlimited number of private or public CPUs. One novel service is large-scale approximate Markov chain Monte Carlo (MCMC) analysis. The accuracy of the results is reliably estimated by running the same computation on multiple CPUs and evaluating the Gelman-Rubin score to set aside unreliable results. Another service within the workflow is a novel parallelized exact algorithm for inferring maximum-likelihood haplotypes. The reported system enables genetic analyses that were previously infeasible. We demonstrate the system's capabilities through a study of a large complex pedigree affected with metabolic syndrome.

AVAILABILITY Superlink-Online SNP is freely available to researchers at http://cbl-hap.cs.technion.ac.il/superlink-snp. The system source code can also be downloaded from the system website.

CONTACT [email protected]

SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
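The Gelman-Rubin check mentioned under RESULTS can be illustrated with a minimal sketch. It is not the system's own implementation: the function names `gelman_rubin` and `is_reliable`, the layout of the input (one row of sampled values per independent MCMC run), and the cutoff of 1.1 (a common rule of thumb) are all assumptions made here for illustration.

```python
import numpy as np


def gelman_rubin(chains):
    """Return the Gelman-Rubin potential scale reduction factor.

    `chains` is a 2-D array-like of shape (m, n): m independent MCMC runs
    (hypothetically, one per CPU), each holding n sampled values of the
    same scalar quantity.
    """
    chains = np.asarray(chains, dtype=float)
    if chains.ndim != 2 or chains.shape[0] < 2 or chains.shape[1] < 2:
        raise ValueError("need at least 2 chains with at least 2 samples each")
    m, n = chains.shape

    # W: average of the within-chain sample variances.
    within = chains.var(axis=1, ddof=1).mean()
    # B: between-chain variance, scaled by the chain length.
    between = n * chains.mean(axis=1).var(ddof=1)

    # Pooled estimate of the target variance, then the ratio to W.
    pooled = (n - 1) / n * within + between / n
    return float(np.sqrt(pooled / within))


def is_reliable(chains, threshold=1.1):
    """Flag a result as reliable when the score is below `threshold`.

    The default of 1.1 is an assumed rule-of-thumb cutoff, not a value
    taken from the paper.
    """
    return gelman_rubin(chains) < threshold
```

A score close to 1 suggests that the independent runs agree with one another; a score well above the cutoff suggests that the runs have not converged, so the corresponding result would be set aside.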

Similar resources

SNP-based linkage analysis in extended pedigrees: comparison between two alternative approaches.

BACKGROUND Linkage analysis on extended pedigrees is often challenged by the high computational demand of exact identity-by-descent (IBD) matrix reconstruction. When such an analysis becomes not feasible, two alternative solutions are contrasted: a full pedigree analysis based on approximate IBD estimation versus a pedigree splitting followed by exact IBD estimation. A multiple splitting (MS) a...

Unifying ideas for non-parametric linkage analysis.

OBJECTIVES Non-parametric linkage analysis (NPL) exploits marker allele sharing among affected relatives to map genes influencing complex traits. Computational barriers force approximate analysis on large pedigrees and the adoption of a questionable perfect data assumption (PDA) in assigning p values. To improve NPL significance testing on large pedigrees, we examine the adverse consequences of...

Mutation Analysis of GJB2 and GJB6 Genes and the Genetic Linkage Analysis of Five Common DFNB Loci in the Iranian Families with Autosomal Recessive Non-Syndromic Hearing Loss

The incidence of pre-lingual hearing loss (HL) is about 1 in 1000 neonates. More than 60% of cases are inherited. Non-syndromic HL (NSHL) is extremely heterogeneous: more than 130 loci have been identified so far. The most common form of NSHL is the autosomal recessive form (ARNSHL). In this study, a cohort of 36 big ARNSHL pedigrees with 4 or more patients from 7 provinces of Iran was investig...

The Pattern of Linkage Disequilibrium in Livestock Genome

Linkage disequilibrium (LD) is the basis of genomic selection, genomic marker imputation, marker assisted selection (MAS), quantitative trait loci (QTL) mapping, parentage testing and whole genome association studies. Particular alleles at close loci have a tendency to be co-inherited. In linked loci this pattern leads to association between alleles in a population, which is known as LD. Two metr...

Estimating the power of variance component linkage analysis in large pedigrees.

Variance component linkage analysis is commonly used to map quantitative trait loci (QTLs) in general pedigrees. Large pedigrees are especially attractive for these studies because they provide greater power per genotyped individual than small pedigrees. We propose accurate and computationally efficient methods to calculate the analytical power of variance component linkage analysis that can ac...

Journal title:
  • Bioinformatics

Volume 29, Issue 2

Pages: -

Publication date: 2013